JUN 30 2003



Title: Modified Surface Antigen 
Inventor: Richard Anselm Peak et al.
Appln. #: ^^71,382 Customer No.: 570 

Atty. Docket No.: 8795-24U1



1/31 



EG327      MNKIYRIIWN SALNAWVAVS ELTRNHTKRA SATVATAVLA
BZ198      MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVATAVLA
BZ10       MNKISRIIWN SALNAWVVVS ELTRNHTKRA SATVATAVLA
H15        MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVATAVLA
EG329      MNEILRIIWN SALNAWVVVS ELTRNHTKRA SATVKTAVLA
PMC21      MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVKTAVLA
H38        MNKIYRIIWN SALNAWVAVS ELTRNHTKRA SATVKTAVLA
P20        MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVATAVLA
Z2491      MNKIYRIIWN SALNAWVAVS ELTRNHTKRA SATVKTAVLA
H41        MNKIYRIIWN SALNAWVAVS ELTRNHTKRA SATVKTAVLA
Consensus  MN-I-RIIWN SALNAWV-VS ELTRNHTKRA SATV-TAVLA
EG327      TEKLSFSANS NKVNITSDTK GLNFAKKTAE TNGDTTVHLN GIGSTLTDTL
BZ198      TEKLSFGANG NKVNITSDTK GLNFAKETAG TNGDPTVHLN GIGSTLTDTL
BZ10       TEKLSFGANG NKVNITSDTK GLNFAKETAG TNGDPTVHLN GIGSTLTDTL
H15        TEKLSFGANG NKVNITSDTK GLNFAKETAG TNGDPTVHLN GIGSTLTDTL
EG329      TEKLSFSANG NKVNITSDTK GLNFAKETAG TNGDTTVHLN GIGSTLTDTL
PMC21      TEKLSFSANG NKVNITSDTK GLNFAKETAG TNGDTTVHLN GIGSTLTDTL
H38        TEKLSFGANG NKVNITSDTK GLNFAKETAG TNGDTTVHLN GIGSTLTDTL
P20        TEKLSFGANG NKVNITSDTK GLNFAKETAG TNGDPTVHLN GIGSTLTDTL
Z2491      TEKLSFGANG KKVNIISDTK GLNFAKETAG TNGDTTVHLN GIGSTLTDTL
H41        TEKLSFGANG KKVNIISDTK GLNFAKETAG TNGDTTVHLN GIGSTLTDML
Consensus  TEKLSF-AN- -KVNI-SDTK GLNFAK-TA- TNGD-TVHLN GIGSTLTD-L
250

EG327      LNTGATTNVT NDNVTDDEKK RAASVKDVLN AGWNIKGVKP GTTAS..DNV
BZ198      LNTGATTNVT NDNVTDDEKK RAASVKDVLN AGWNIKGVKP GTTAS..DNV
BZ10       LNTGATTNVT NDNVTDDEKK RAASVKDVLN AGWNIKGVKP GTTAS..DNV
H15        LNTGATTNVT NDNVTDDEKK RAASVKDVLN AGWNIKGVKP GTTAS..DNV
EG329      LNTGATTNVT NDNVTDDEKK RAASVKDVLN AGWNIKGVKP GTTAS..DNV
PMC21      LNTGATTNVT NDNVTDDEKK RAASVKDVLN AGWNIKGVKP GTTAS..DNV
H38        LNTGATTNVT NDNVTDDKKK RAASVKDVLN AGWNIKGVKP GTTAS..DNV
P20        AGSSASHVDA GNQST..HYT RAASIKDVLN AGWNIKGVKT GSTTGQSENV
Z2491      AGSSASHVDA GNQST..HYT RAASIKDVLN AGWNIKGVKT GSTTGQSENV
H41        LNTGATTNVT NDNVTDDEKK RAASVKDVLN AGWNIKGVKP GTTAS..DNV
Consensus  ----A----- ----T----- RAAS-KDVLN AGWNIKGVK- G-T-----NV

V3 C4 V4
EG327      DFVRTYDTVE FLSADTKTTT VNVESKDNGK RTEVKIGAKT SVIKEKDGKL
BZ198      DFVRTYDTVE FLSADTKTTT VNVESKDNGK KTEVKIGAKT SVIKEKDGKL
BZ10       DFVRTYDTVE FLSADTKTTT VNVESKDNGK RTEVKIGAKT SVIKEKDGKL
H15        DFVRTYDTVE FLSADTKTTT VNVESKDNGK KTEVKIGAKT SVIKEKDGKL
EG329      DFVRTYDTVE FLSADTKTTT VNVESKDNGK KTEVKIGAKT SVIKEKDGKL
PMC21      DFVRTYDTVE FLSADTKTTT VNVESKDNGK KTEVKIGAKT SVIKEKDGKL
H38        DFVHTYDTVE FLSADTKTTT VNVESKDNGK RTEVKIGAKT SVIKEKDGKL
P20        DFVRTYDTVE FLSADTKTTT VNVESKDNGK RTEVKIGAKT SVIKEKDGKL
Z2491      DFVRTYDTVE FLSADTKTTT VNVESKDNGK RTEVKIGAKT SVIKEKDGKL
H41        DFVRTYDTVE FLSADTKTTT VNVESKDNGK KTEVKIGAKT SVIKEKDGKL
Consensus  DFV-TYDTVE FLSADTKTTT VNVESKDNGK -TEVKIGAKT SVIKEKDGKL
501 550

EG327      VKEGDVTNVA QLKGVAQNLN NHIDNVDGNA RAGIAQAIAT AGLVQAYLPG
BZ198      VKEGDVTNVA QLKGVAQNLN NRIDNVDGNA RAGIAQAIAT AGLVQAYLPG
BZ10       VKEGDVTNVA QLKGVAQNLN NRIDNVDGNA RAGIAQAIAT AGLAQAYLPG
H15        VKEGDVTNVA QLKGVAQNLN NRIDNVDGNA RAGIAQAIAT AGLAQAYLPG
EG329      VKEGDVTNVA QLKGVAQNLN NRIDNVDGNA RAGIAQAIAT AGLVQAYLPG
PMC21      VKEGDVTNVA QLKGVAQNLN NRIDNVDGNA RAGIAQAIAT AGLVQAYLPG
H38        VKEGDVTNVA QLKGVAQNLN NRIDNVDGNA RAGIAQAIAT AGLVQAYLPG
P20        VKEGDVTNVA QLKGVAQNLN NRIDNVNGNA RAGIAQAIAT AGLAQAYLPG
Z2491      VKEGDVTNVA QLKGVAQNLN NRIDNVDGNA RAGIAQAIAT AGLVQAYLPG
H41        VKEGDVTNVA QLKGVAQNLN NRIDNVNGNA RAGIAQAIAT AGLVQAYLPG
Consensus  VKEGDVTNVA QLKGVAQNLN N-IDNV-GNA RAGIAQAIAT AGL-QAYLPG
EG327      KSMMAIGGGT YRGEAGYAIG YSSISDGGNW IIKGTASGNS RGHFGASASV
BZ198      KSMMAIGGDT YRGEAGYAIG YSSISDGGNW IIKGTASGNS RGHFGASASV
BZ10       KSMMAIGGGT YRGEAGYAIG YSSISDTGNW VIKGTASGNS RGHFGTSASV
H15        KSMMAIGGGT YRGEAGYAIG YSSISDTGNW VIKGTASGNS RGHFGASASV
EG329      KSMMAIGGGT YRGEAGYAIG YSSISDGGNW IIKGTASGNS RGHFGASASV
PMC21      KSMMAIGGGT YRGEAGYAIG YSSISDGGNW IIKGTASGNS RGHFGASASV
H38        KSMMAIGGGT YRGEAGYAIG YSSISDGGNW IIKGTASGNS RGHFGASASV
P20        KSMMAIGGGT YLGEAGYAIG YSSISDTGNW VIKGTASGNS RGHFGTSASV
Z2491      KSMMAIGGGT YRGEAGYAIG YSSISDGGNW IIKGTASGNS RGHFGASASV
H41        KSMMAIGGGT YLGEAGYAIG YSSISAGGNW IIKGTASGNS RGHFGASASV
Consensus  KSMMAIGG-T Y-GEAGYAIG YSSIS--GNW -IKGTASGNS RGHFG-SASV
EG327      GYQW.
BZ198      GYQW.
BZ10       GYQW.
H15        GYQW.
EG329      GYQW.
PMC21      GYQW.
H38        GYQW.
P20        GYQW.
Z2491      GYQW.
H41        GYQW.
Consensus  GYQW.
H15        ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCCTGGGT CGTCGTATCC GAGCTCACAC
BZ10       ATGAACAAAA TATCCCGCAT CATTTGGAAT AGTGCCCTCA ATGCCTGGGT CGTCGTATCC GAGCTCACAC
BZ198      ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCCTGGGT CGTCGTATCC GAGCTCACAC
P20        ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCCTGGGT AGTCGTATCC GAGCTCACAC
H38        ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCCTGGGT CGCCGTATCC GAGCTCACAC
Z2491      ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCCTGGGT CGCCGTATCC GAGCTCACAC
H41        ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCCTGGGT CGCCGTATCC GAGCTCACAC
EG329      ATGAACGAAA TATTGCGCAT CATTTGGAAT AGCGCCCTCA ATGCCTGGGT CGTTGTATCC GAGCTCACAC
PMC21      ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCATGGGT CGTCGTATCC GAGCTCACAC
EG327      ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCCTGGGT CGCCGTATCC GAGCTCACAC
Consensus  ATGAAC-AAA TAT--CGCAT CATTTGGAAT AG-GCCCTCA ATGC-TGGGT -G--GTATCC GAGCTCACAC

C1
141 210

H15        TCAGGCGAAT GCTACCGATG ACGAC GATTTA TATTTAGAAC CCGTACAACG CACTGCTGTC
BZ10       TCAGGCGAAT GCTACCGATG ACGAC GATTTA TATTTAGAAC CCGTACAACG CACTGCTGTC
BZ198      TCAGGCGAAT GCTACCGATG ACGAC GATTTA TATTTAGAAC CCGTACAACG CACTGCTGTC
P20        TCAGGCGAAT GCTACCGATA CCGAT GAAGATGAA GAGTTAGAAT CCGTAGCACG CTCTGCTCTG
H38        TCAGGCGAAT GCTACCGATG AAGAT GAAGAAGAA GAGTTAGAAC CCGTAGTACG CTCTGCTCTG
Z2491      TCAGGCGAAT GCTACCGATG AAGAT GAAGAAGAA GAGTTAGAAT CCGTACAACG CTCTGTCGTA
H41        TCAGGCGAAT GCTACCGATG AAGAT GAAGAAGAA GAGTTAGAAT CCGTACAACG CTCTG. . -TC
EG329      TCAGGCAAGT GCTAACAATG AAGAGCAAGA AGAAGATTTA TATTTAGACC CCGTGCTACG CACTGTTGCC
PMC21      TCAGGCAAGT GCTAACAATG AAGAGCAAGA AGAAGATTTA TATTTAGACC CCGTACAACG CACTGTTGCC
EG327      TCAGGCGAGT ACTACCGATG ACGAC GATTTA TATTTAGAAC CCGTACAACG CACTGCTGTC
Consensus  TCAGGC-A-T -CTA-C-AT- -- GA GA A -A-TTAGA -- CCGT ACG C-CTG
CGACGATCAC
CGACAATCAC 
AGGAATTTGT 
CAGCAAGGAA 
CAACGAGAAA 
CAACGAGAAA 
CGACAAGAAA 



AGAGTACTAA
AGAGTACTAA
AGAGTACTAA
AACACTCTAC
AACACTCTAC
AGACCCATAC
TTTGTAGACC
GGAGTACTAA
GGAGTACTAA
GGAGTACTAA



AAGCCGGAGC 

AAGCCGGAGC 

AAGCCGGAGC 

ACGGCGCAAC 

ACGGCGCAAC 

ATAGTA. . . . 

CATACATAGT 

CAGCCAGAGA 

CAGCCAGAGA 

CAGCCGGAAC 



AATCACCCTC 
AATCACCCTC 
AATCACCCTC 
CGTTACCCTC 
CGTTACCCTC 
. GTTACCCTC 
AGTTACCCTC 
AATCACCCTC 
AATCACCCTC 
AATCACCCTC 
— T-ACCCTC 



AAAGCCGGCG 
AAAGCCGGCG 
AAAGCCGGCG 
AAAGCCGGCG 
AAAGCCGGCG 
AAAGCCGGCG 
AAAGCCGGCG 
AAAGCCGGCG 
AAAGCCGGCG 
AAAGCCGGCG 
AAAGCCGGCG 



V1



350

ACAACCTGAA
ACAACCTGAA
ACAACCTGAA
ACAACCTGAA 
ACAACCTGAA 
ACAACCTGAA 
ACAACCTGAA 
ACAACCTGAA 
ACAACCTGAA 
ACAACCTGAA 
ACAACCTGAA 



C2 



351 

AATCAAACAA 
AATCAAACAA 
AATCAAACAA 
AATCAAACAA 
AATCAAACAA 
AATCAAACAA 
AATCAAACAA 
AATCAAACAA 
AATCAAACAA 
AATCAAACAA 
AATCAAACAA 

C2 



AACACCAATG AAAACACCAA TGAAAACACC AATGACAGTA GCTTCACC TA 
AACAC C AATG AAAACACCAA TGAAAACACC AATGACAGTA GCTTCACC TA 

AACACCAATG AAAACACC AATGACAGTA GCTTCACC TA 

AGCGGCAAAG A CTTCACCTA 

AACACCAATA AAAACACCAA TGAAAACACC AATGACAGTA GCTTCACCTA 

AACACCAATG AAAACACC AATGCCAGTA GCTTCACCTA 

AACACCAATG AAAACACC AATGCCAGTA GCTTCACCTA 

AAC C GCACAA ACTTCACCTA 

AAC G GCACAA ACTTCACCTA 

AACACCAATG AAAACACC AATGCCAGTA GCTTCACCTA 

^~ c : CTTCACCTA 

V2 



420 

CTCCCTGAAA 
CTCCCTGAAA 
CTCCCTGAAA 
CTCGCTGAAA 
CTCGCTGAAA 
CTCGCTGAAA 
CTCGCTGAAA 
CTCGCTGAAA 
CTCGCTGAAA 
CTCGCTGAAA 
CTC-CTGAAA 



C3 



FIG. 2B 




H15
BZ10
BZ198
P20
H38
Z2491
H41
EG329
PMC21
EG327
Consensus

H15
BZ10
BZ198
P20
H38
Z2491
H41
EG329
PMC21
EG327
Consensus

H15
BZ10
BZ198
P20
H38
Z2491
H41
EG329
PMC21
EG327
Consensus
421 490
H15        AAAGACCTCA CAGATCTGAC CAGTGTTGAA ACTGAAAAAT TATCGTTTGG CGCAAACGGT AATAAAGTCA
BZ10       AAAGACCTCA CAGATCTGAC CAGTGTTGAA ACTGAAAAAT TATCGTTTGG CGCAAACGGT AATAAAGTCA
BZ198      AAAGACCTCA CAGATCTGAC CAGTGTTGAA ACTGAAAAAT TATCGTTTGG CGCAAACGGT AATAAAGTCA
P20        AAAGAGCTGA AAGACCTGAC CAGTGTTGAA ACTGAAAAAT TATCGTTTGG CGCAAACGGT AATAAAGTCA
H38        AAAGACCTCA CAGATCTGAC CAGTGTTGAA ACTGAAAAAT TATCGTTTGG CGCAAACGGC AATAAAGTCA
Z2491      AAAGACCTCA CAGGCCTGAT CAATGTTGAA ACTGAAAAAT TATCGTTTGG CGCAAACGGC AAGAAAGTCA
H41        AAAGACCTCA CAGGCCTGAT CAATGTTGAA ACTGAAAAAT TATCGTTTGG CGCAAACGGC AAGAAAGTCA
EG329      AAAGACCTCA CAGATCTGAC CAGTGTTGGA ACTGAAAAAT TATCGTTTAG CGCAAACGGC AATAAAGTCA
PMC21      AAAGACCTCA CAGATCTGAC CAGTGTTGGA ACTGAAAAAT TATCGTTTAG CGCAAACGGC AATAAAGTCA
EG327      AAAGACCTCA CAGATCTGAC CAGTGTTGGA ACTGAAAAAT TATCGTTTAG CGCAAACAGC AATAAAGTCA
Consensus  AAAGA-CT-A -AG--CTGA- CA-TGTTG-A ACTGAAAAAT TATCGTTT-G CGCAAAC-G- AA-AAAGTCA

C3

491 560
H15        ACATCACAAG CGACACCAAA GGCTTGAATT TTGCGAAAGA AACGGCTGGG ACGAACGGCG ACCCCACGGT
BZ10       ACATCACAAG CGACACCAAA GGCTTGAATT TTGCGAAAGA AACGGCTGGG ACGAACGGCG ACCCCACGGT
BZ198      ACATCACAAG CGACACCAAA GGCTTGAATT TTGCGAAAGA AACGGCTGGG ACGAACGGCG ACCCCACGGT
P20        ACATCACAAG CGACACCAAA GGCTTGAATT TTGCGAAAGA AACGGCTGGG ACGAACGGCG ACCCCACGGT
H38        ACATCACAAG CGACACCAAA GGCTTGAATT TCGCGAAAGA AACGGCTGGG ACGAACGGCG ACACCACGGT
Z2491      ACATCATAAG CGACACCAAA GGCTTGAATT TCGCGAAAGA AACGGCTGGG ACGAACGGCG ACACCACGGT
H41        ACATCATAAG CGACACCAAA GGCTTGAATT TCGCGAAAGA AACGGCTGGG ACGAACGGCG ACACCACGGT
EG329      ACATCACAAG CGACACCAAA GGCTTGAATT TTGCGAAAGA AACGGCTGGG ACGAACGGCG ACACCACGGT
PMC21      ACATCACAAG CGACACCAAA GGCTTGAATT TTGCGAAAGA AACGGCTGGG ACGAACGGCG ACACCACGGT
EG327      ACATCACAAG CGACACCAAA GGCTTGAATT TCGCGAAAAA AACGGCTGAG ACGAACGGCG ACACCACGGT
Consensus  ACATCA-AAG CGACACCAAA GGCTTGAATT T-GCGAAA-A AACGGCTG-G AC-AACGGCG AC-CCACGGT

C3

561 630
H15        TCATCTGAAC GGTATCGGTT CGACTTTGAC CGATACGCTG CTGAATACCG GAGCGACCAC AAACGTAACC
BZ10       TCATCTGAAC GGTATCGGTT CGACTTTGAC CGATACGCTG CTGAATACCG GAGCGACCAC AAACGTAACC
BZ198      TCATCTGAAC GGTATCGGTT CGACTTTGAC CGATACGCTG CTGAATACCG GAGCGACCAC AAACGTAACC
P20        TCATCTGAAC GGTATCGGTT CGACTTTGAC CGATACGCTT GCGGGTTCTT CTGCTTCTCA CGTTGATGCG
H38        TCATCTGAAC GGTATTGGTT CGACTTTGAC CGATACGCTG CTGAATACCG GAGCGACCAC AAACGTAACC
Z2491      TCATCTGAAC GGTATCGGTT CGACTTTGAC CGATACGCTT GCGGGTTCTT CTGCTTCTCA CGTTGATGCG
H41        TCATCTGAAC GGTATCGGTT CGACTTTGAC CGATATGCTG CTGAATACCG GAGCGACCAC AAACGTAACC
EG329      TCATCTGAAC GGTATTGGTT CGACTTTGAC CGATACGCTG CTGAATACCG GAGCGACCAC AAACGTAACC
PMC21      TCATCTGAAC GGTATTGGTT CGACTTTGAC CGATACGCTG CTGAATACCG GAGCGACCAC AAACGTAACC
EG327      TCATCTGAAC GGTATCGGTT CGACTTTGAC CGATACGCTG CTGAATACCG GAGCGACCAC AAACGTAACC
Consensus  TCATCTGAAC GGTAT-GGTT CGACTTTGAC CGATA-GCT- --G--T-C-- --GC--C--- ----G---C-

C3 V3

FIG. 2C



631 700
H15        AACGACAACG TTACCGATGA CGAGAAAAAA CGTGCGGCAA GCGTTAAAGA CGTATTAAAC GCAGGCTGGA
BZ10       AACGACAACG TTACCGATGA CGAGAAAAAA CGTGCGGCAA GCGTTAAAGA CGTATTAAAC GCAGGCTGGA
BZ198      AACGACAACG TTACCGATGA CGAGAAAAAA CGTGCGGCAA GCGTTAAAGA CGTATTAAAC GCAGGCTGGA
P20        GGTAACCAAA GTACACATTA C......ACT CGTGCAGCAA GTATTAAGGA TGTGTTGAAT GCGGGTTGGA
H38        AACGACAACG TTACCGATGA CAAGAAAAAA CGTGCGGCAA GCGTTAAAGA CGTATTAAAC GCAGGCTGGA
Z2491      GGTAACCAAA GTACACATTA C......ACT CGTGCAGCAA GTATTAAGGA TGTGTTGAAT GCGGGTTGGA
H41        AACGACAACG TTACCGATGA CGAGAAAAAA CGTGCGGCAA GCGTTAAAGA CGTATTAAAC GCAGGCTGGA
EG329      AACGACAACG TTACCGATGA CGAGAAAAAA CGTGCGGCAA GCGTTAAAGA CGTATTAAAC GCTGGCTGGA
PMC21      AACGACAACG TTACCGATGA CGAGAAAAAA CGTGCGGCAA GCGTTAAAGA CGTATTAAAC GCTGGCTGGA
EG327      AACGACAACG TTACCGATGA CGAGAAAAAA CGTGCGGCAA GCGTTAAAGA CGTATTAAAC GCAGGCTGGA
Consensus  ----AC-A-- -TAC--AT-A C------A-- CGTGC-GCAA G--TTAA-GA -GT-TT-AA- GC-GG-TGGA

V3 C4

701 770
H15        ACATTAAAGG CGTTAAACCC GGTACAACAG CT......TC CGATAACGTT GATTTCGTCC GCACTTACGA
BZ10       ACATTAAAGG CGTTAAACCC GGTACAACAG CT......TC CGATAACGTC GATTTCGTCC GCACTTACGA
BZ198      ACATTAAAGG CGTTAAACCC GGTACAACAG CT......TC CGATAACGTT GATTTCGTCC GCACTTACGA
P20        ATATTAAGGG TGTTAAAACT GGCTCAACAA CTGGTCAATC AGAAAATGTC GATTTCGTCC GCACTTACGA
H38        ACATTAAAGG CGTTAAACCC GGTACAACAG CT......TC CGATAACGTT GATTTCGTCC ACACTTACGA
Z2491      ATATTAAGGG TGTTAAAACT GGCTCAACAA CTGGTCAATC AGAAAATGTC GATTTCGTCC GCACTTACGA
H41        ACATTAAAGG CGTTAAACCC GGTACAACAG CT......TC CGATAACGTT GATTTCGTCC GCACTTACGA
EG329      ACATTAAAGG CGTTAAACCC GGTACAACAG CT......TC CGATAACGTT GATTTCGTCC GCACTTACGA
PMC21      ACATTAAAGG CGTTAAACCC GGTACAACAG CT......TC CGATAACGTT GATTTCGTCC GCACTTACGA
EG327      ACATTAAAGG CGTTAAACCC GGTACAACAG CT......TC CGATAACGTT GATTTCGTCC GCACTTACGA
Consensus  A-ATTAA-GG -GTTAAA-C- GG--CAACA- CT------TC -GA-AA-GT- GATTTCGTCC -CACTTACGA

C4 V4 C5

771 840
H15        CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG
BZ10       CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG
BZ198      CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG
P20        CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG
H38        CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG
Z2491      CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG
H41        CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG
EG329      CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG
PMC21      CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG
EG327      CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG
Consensus  CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT GTTAATGTGG AAAGCAAAGA CAACGGCAAG

C5

841 910
H15        AAAACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTATTA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA
BZ10       AGAACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTATTA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA
BZ198      AAAACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTATTA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA
P20        AGAACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTATTA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA
H38        AGAACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTATTA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA
Z2491      AGAACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTATTA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA
H41        AAAACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTATTA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA
EG329      AAAACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTATTA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA
PMC21      AAAACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTATTA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA
EG327      AGAACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTATCA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA
Consensus  A-AACCGAAG TTAAAATCGG TGCGAAGACT TCTGTTAT-A AAGAAAAAGA CGGTAAGTTG GTTACTGGTA

C5

FIG. 2D
911 980
H15        AAGGCAAAGA CGAGAATGGT TCTTCTACAG ACGAAGGCGA AGGCTTAGTG ACTGCAAAAG AAGTGATTGA
BZ10       AAGGCAAAGG CGAGAATGGT TCTTCTACAG ACGAAGGCGA AGGCTTAGTG ACTGCAAAAG --GTGATTGA
BZ198      AAGGCAAAGA CGAGAATGGT TCTTCTACAG ACGAAGGCGA AGGCTTAGTG ACTGCAAAAG AAGTGATTGA
P20        AAGGCAAAGG CGAGAATGGT TCTTCTACAG ACGAAGGCGA AGGCTTAGTG ACTGCAAAAG AAGTGATTGA
H38        AAGGCAAAGG CGAGAATGGT TCTTCTACAG ACGAAGGCGA AGGCTTAGTG ACTGCAAAAG AAGTGATTGA
Z2491      AAGGCAAAGG CGAGAATGGT TCTTCTACAG ACGAAGGCGA AGGCTTAGTG ACTGCAAAAG AAGTGATTGA
H41        AAGGCAAAGG CGAGAATGGT TCTTCTACAG ACGAAGGCGA AGGCTTAGTG ACTGCAAAAG AAGTGATTGA
EG329      AAGACAAAGG CGAGAATGGT TCTTCTACAG ACGAAGGCGA AGGCTTAGTG ACTGCAAAAG AAGTGATTGA
PMC21      AAGACAAAGG CGAGAATGGT TCTTCTACAG ACGAAGGCGA AGGCTTAGTG ACTGCAAAAG AAGTGATTGA
EG327      AAGACAAAGG CGAGAATGAT TCTTCTACAG ACAAAGGCGA AGGCTTAGTG ACTGCAAAAG AAGTGATTGA
Consensus  AAG-CAAAG- CGAGAATG-T TCTTCTACAG AC-AAGGCGA AGGCTTAGTG ACTGCAAAAG --GTGATTGA

C5

981 1050
H15        TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG
BZ10       TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG
BZ198      TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG
P20        TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG
H38        TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG
Z2491      TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG
H41        TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG
EG329      TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG
PMC21      TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG
EG327      TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG
Consensus  TGCAGTAAAC AAGGCTGGTT GGAGAATGAA AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG

C5

1051 1120
H15        TTTGAAACCG TTACATCAGG CACAAAAGTA ACCTTTGCTA GTGGTAATGG TACAACTGCG ACTGTAAGTA
BZ10       TTTGAAACCG TTACATCAGG CACAAAAGTA ACCTTTGCTA GTGGTAATGG TACAACTGCG ACTGTAAGTA
BZ198      TTTGAAACCG TTACATCAGG CACAAATGTA ACCTTTGCTA GTGGTAAAGG TACAACTGCG ACTGTAAGTA
P20        TTTGAAACCG TTACATCAGG CACAAAAGTA ACCTTTGCTA GTGGTAATGG TACAACTGCG ACTGTAAGTA
H38        TTTGAAACCG TTACATCAGG CACAAATGTA ACCTTTGCTA GTGGTAAAGG TACAACTGCG ACTGTAAGTA
Z2491      TTTGAAACCG TTACATCAGG CACAAATGTA ACCTTTGCTA GTGGTAAAGG TACAACTGCG ACTGTAAGTA
H41        TTTGAAACCG TTACATCAGG CACAAAAGTA ACCTTTGCTA GTGGTAATGG TACAACTGCG ACTGTAAGTA
EG329      TTTGAAACCG TTACATCAGG CACAAATGTA ACCTTTGCTA GTGGTAAAGG TACAACTGCG ACTGTAAGTA
PMC21      TTTGAAACCG TTACATCAGG CACAAATGTA ACCTTTGCTA GTGGTAAAGG TACAACTGCG ACTGTAAGTA
EG327      TTTGAAACCG TTACATCAGG CACAAATGTA ACCTTTGCTA GTGGTAAAGG TACAACTGCG ACTGTAAGTA
Consensus  TTTGAAACCG TTACATCAGG CACAAA-GTA ACCTTTGCTA GTGGTAA-GG TACAACTGCG ACTGTAAGTA

1121 1190
H15        AAGATGATCA AGGCAACATC ACTGTTAAGT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT
BZ10       AAGATGATCA AGGCAACATC ACTGTTAAGT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT
BZ198      AAGATGATCA AGGCAACATC ACTGTTAAGT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT
P20        AAGATGATCA AGGCAACATC ACTGTTAAGT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT
H38        AAGATGATCA AGGCAACATC ACTGTTAAGT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT
Z2491      AAGATGATCA AGGCAACATC ACTGTTATGT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT
H41        AAGATGATCA AGGCAACATC ACTGTTAAGT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT
EG329      AAGATGATCA AGGCAACATC ACTGTTATGT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT
PMC21      AAGATGATCA AGGCAACATC ACTGTTATGT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT
EG327      AAGATGATCA AGGCAACATC ACTGTTATGT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT
Consensus  AAGATGATCA AGGCAACATC ACTGTTA-GT ATGATGTAAA TGTCGGCGAT GCCCTAAACG TCAATCAGCT

C5

FIG. 2E




1191 1260
H15        GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT
BZ10       GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT
BZ198      GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT
P20        GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT
H38        GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT
Z2491      GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT
H41        GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT
EG329      GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT
PMC21      GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT
EG327      GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT
Consensus  GCAAAACAGC GGTTGGAATT TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT

C5

1261 1330
H15        GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTACCC
BZ10       GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTACCC
BZ198      GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTACCC
P20        GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTACCC
H38        GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTACCC
Z2491      GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTAGCC
H41        GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTACCC
EG329      GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTACCC
PMC21      GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTACCC
EG327      GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTACCC
Consensus  GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG CAACAACATC GAGATTA-CC

C5

1331 1400
H15        GCAACGGCAA AAATATCGAC ATCGCCACTT CGATGACCCC GCAATTTTCC AGCGTTTCGC TCGGCGCGGG
BZ10       GCAACGGCAA AAATATCGAC ATCGCCACTT CGATGACCCC GCAATTTTCC AGCGTTTCGC TCGGCGCGGG
BZ198      GCAACGGTAA AAATATCGAC ATCGCCACTT CGATGGCGCC GCAGTTTTCC AGCGTTTCGC TCGGTGCGGG
P20        GCAACGGCAA AAATATCGAC ATCGCCACTT CGATGACCCC GCAATTTTCC AGCGTTTCGC TCGGCGCGGG
H38        GCAACGGTAA AAATATCGAC ATCGCCACTT CGATGACCCC GCAGTTTTCC AGCGTTTCGC TCGGCGCGGG
Z2491      GCAACGGTAA AAATATCGAC ATCGCCACTT CGATGGCGCC GCAGTTTTCC AGCGTTTCGC TCGGCGCGGG
H41        GCAACGGCAA AAATATCGAC ATCGCCACTT CGATGACCCC GCAATTTTCC AGCGTTTCGC TCGGCGCGGG
EG329      GCAACGGTAA AAATATCGAC ATCGCCACTT CGATGACCCC GCAGTTTTCC AGCGTTTCGC TCGGCGCGGG
PMC21      GCAACGGTAA AAATATCGAC ATCGCCACTT CGATGACCCC GCAGTTTTCC AGCGTTTCGC TCGGCGCGGG
EG327      GCAACGGCAA AAATATCGAC ATCGCCACTT CGATGACCCC GCAATTTTCC AGCGTTTCGC TCGGCGCGGG
Consensus  GCAACGG-AA AAATATCGAC ATCGCCACTT CGATG-C-CC GCA-TTTTCC AGCGTTTCGC TCGG-GCGGG

C5

1401 1470
H15        GGCGGATGCG CCCACTTTAA GCGTGGATGA CGAGGGCGCG TTGAATGTCG GCAGCAAGGA TGCCAACAAA
BZ10       GGCGGATGCG CCCACTTTAA GCGTGGATGA CGAGGGCGCG TTGAATGTCG GCAGCAAGGA TGCCAACAAA
BZ198      GGCGGATGCG CCCACTTTGA GCGTGGATGA CGAGGGCGCG TTGAATGTCG GCAGCAAGGA TACCAACAAA
P20        GGCGGATGCG CCCACTTTAA GCGTGGATGA CGAGGGCGCG TTGAATGTCG GCAGCAAGGA TGCCAACAAA
H38        GGCGGATGCG CCCACTTTGA GCGTGGATGA CAAGGGCGCG TTGAATGTCG GCAGCAAGGA TGCCAACAAA
Z2491      GGCAGATGCG CCCACTTTAA GCGTGGATGA CGAGGGCGCG TTGAATGTCG GCAGCAAGGA TGCCAACAAA
H41        GGCGGATGCG CCCACTTTAA GCGTGGATGA CGAGGGCGCG TTGAATGTCG GCAGCAAGGA TGCCAACAAA
EG329      GGCGGATGCG CCCACTTTGA GCGTGGAT.. .GGGGACGCA TTGAATGTCG GCAGCAAGAA GGACAACAAA
PMC21      GGCGGATGCG CCCACTTTGA GCGTGGAT.. .GGGGACGCA TTGAATGTCG GCAGCAAGAA GGACAACAAA
EG327      GGCGGATGCG CCCACTTTAA GCGTGGATGA CGAGGGCGCG TTGAATGTCG GCAGCAAGGA TGCCAACAAA
Consensus  GGC-GATGCG CCCACTTT-A GCGTGGAT-- ---GG-CGC- TTGAATGTCG GCAGCAAG-A ---CAACAAA

C5



1471 1540
H15        CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGCA CAACTTAAAG
BZ10       CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGCA CAACTTAAAG
BZ198      CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGCA CAACTTAAAG
P20        CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGCA CAACTTAAAG
H38        CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGCA CAACTTAAAG
Z2491      CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGCA CAACTTAAAG
H41        CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGCG CAACTTAAAG
EG329      CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGCA CAACTTAAAG
PMC21      CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGCA CAACTTAAAG
EG327      CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGCA CAACTTAAAG
Consensus  CCCGTCCGCA TTACCAATGT CGCCCCGGGC GTTAAAGAGG GGGATGTTAC AAACGTCGC- CAACTTAAAG

C5

1541 1610
H15        GTGTGGCGCA AAACTTGAAC AACCGCATCG ACAATGTGGA CGGCAACGCG CGCGCGGGTA TCGCCCAAGC
BZ10       GTGTGGCGCA AAACTTGAAC AACCGCATCG ACAATGTGGA CGGCAACGCG CGCGCGGGTA TCGCCCAAGC
BZ198      GCGTGGCGCA AAACTTGAAC AACCGCATCG ACAATGTGGA CGGCAACGCG CGTGCGGGCA TCGCCCAAGC
P20        GTGTGGCGCA AAACTTGAAC AACCGCATCG ACAATGTGAA CGGCAACGCG CGCGCGGGTA TCGCCCAAGC
H38        GCGTGGCGCA AAACTTGAAC AACCGCATCG ACAATGTGGA CGGCAACGCG CGTGCGGGCA TCGCCCAAGC
Z2491      GCGTGGCGCA AAACTTGAAC AACCGCATCG ACAATGTGGA CGGCAACGCG CGTGCGGGCA TCGCCCAAGC
H41        GTGTGGCGCA AAACTTGAAC AACCGCATCG ACAATGTGAA CGGCAACGCG CGTGCGGGCA TCGCCCAAGC
EG329      GCGTGGCGCA AAACTTGAAC AACCGCATCG ACAATGTGGA CGGCAACGCG CGTGCGGGCA TCGCCCAAGC
PMC21      GCGTGGCGCA AAACTTGAAC AACCGCATCG ACAATGTGGA CGGCAACGCG CGTGCGGGCA TCGCCCAAGC
EG327      GCGTGGCGCA AAACTTGAAC AACCACATCG ACAATGTGGA CGGCAACGCG CGTGCGGGCA TCGCCCAAGC
Consensus  G-GTGGCGCA AAACTTGAAC AACC-CATCG ACAATGTG-A CGGCAACGCG CG-GCGGG-A TCGCCCAAGC

C5

H15        GATTGCAACC GCAGGTTTGG CTCAGGCGTA TTTGCCCGGC AAGAGTATGA TGGCGATCGG CGGCGGTACT
BZ10       GATTGCAACC GCAGGTTTGG CTCAGGCCTA TTTGCCCGGC AAGAGTATGA TGGCGATCGG CGGCGGTACT
BZ198      GATTGCAACC GCAGGTCTAG TTCAGGCGTA TCTGCCCGGC AAGAGTATGA TGGCGATCGG CGGCGACACT
P20        GATTGCAACC GCAGGTTTGG CTCAGGCCTA TTTGCCCGGC AAGAGTATGA TGGCGATCGG CGGCGGTACT
H38        GATTGCAACC GCAGGTCTGG TTCAGGCGTA TCTGCCCGGC AAGAGTATGA TGGCGATCGG CGGCGGCACT
Z2491      GATTGCAACC GCAGGTCTGG TTCAGGCGTA TCTGCCCGGC AAGAGTATGA TGGCGATCGG CGGCGGCACT
H41        GATTGCAACC GCAGGTCTGG TTCAGGCGTA TCTGCCCGGC AAGAGTATGA TGGCGATCGG CGGCGGCACT
EG329      GATTGCAACC GCAGGTCTGG TTCAGGCGTA TTTGCCCGGC AAGAGTATGA TGGCGATCGG CGGCGGCACT
PMC21      GATTGCAACC GCAGGTCTGG TTCAGGCGTA TTTGCCCGGC AAGAGTATGA TGGCGATCGG CGGCGGCACT
EG327      GATTGCAACC GCAGGTCTGG TTCAGGCGTA TCTGCCCGGC AAGAGTATGA TGGCGATCGG CGGCGGCACT
Consensus  GATTGCAACC GCAGGT-T-G -TCAGGC-TA T-TGCCCGGC AAGAGTATGA TGGCGATCGG CGGCG--ACT

C5

FIG. 2G
FIG. 3A

FIG. 3B



16/31 




FIG. 4A 
1     MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVKTAVLA TLLFATVQAS
51    ANNETDLTSV GTEKLSFSAN GNKVNITSDT KGLNFAKETA GTNGDTTVHL
101   NGIGSTLTDT LLNTGATTNV TNDNVTDDEK KRAASVKDVL NAGWNIKGVK
151   PGTTASDNVD FVRTYDTVEF LSADTKTTTV NVESKDNGKK TEVKIGAKTS
201   VIKEKDGKLV TGKDKGENGS STDEGEGLVT AKEVIDAVNK AGWRMKTTTA
251   NGQTGQADKF ETVTSGTNVT FASGKGTTAT VSKDDQGNIT VMYDVNVGDA
301   LNVNQLQNSG WNLDSKAVAG SSGKVISGNV SPSKGKMDET VNINAGNNIE
351   ITRNGKNIDI ATSMTPQFSS VSLGAGADAP TLSVDGDALN VGSKKDNKPV
401   RITNVAPGVK EGDVTNVAQL KGVAQNLNNR IDNVDGNARA GIAQAIATAG
451   LVQAYLPGKS MMAIGGGTYR GEAGYAIGYS SISDGGNWII KGTASGNSRG
501   HFGASASVGY QW*



FIG. 5A 



1     ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCATGGGT
51    CGTCGTATCC GAGCTCACAC GCAACCACAC CAAACGCGCC TCCGCAACCG
101   TGAAGACCGC CGTATTGGCG ACTCTGTTGT TTGCAACGGT TCAGGCAAGT
151   GCTAACAATG AAACAGATCT GACCAGTGTT GGAACTGAAA AATTATCGTT
201   TAGCGCAAAC GGCAATAAAG TCAACATCAC AAGCGACACC AAAGGCTTGA
251   ATTTTGCGAA AGAAACGGCT GGGACGAACG GCGACACCAC GGTTCATCTG
301   AACGGTATTG GTTCGACTTT GACCGATACG CTGCTGAATA CCGGAGCGAC
351   CACAAACGTA ACCAACGACA ACGTTACCGA TGACGAGAAA AAACGTGCGG
401   CAAGCGTTAA AGACGTATTA AACGCTGGCT GGAACATTAA AGGCGTTAAA
451   CCCGGTACAA CAGCTTCCGA TAACGTTGAT TTCGTCCGCA CTTACGACAC
501   AGTCGAGTTC TTGAGCGCAG ATACGAAAAC AACGACTGTT AATGTGGAAA
551   GCAAAGACAA CGGCAAGAAA ACCGAAGTTA AAATCGGTGC GAAGACTTCT
601   GTTATTAAAG AAAAAGACGG TAAGTTGGTT ACTGGTAAAG ACAAAGGCGA
651   GAATGGTTCT TCTACAGACG AAGGCGAAGG CTTAGTGACT GCAAAAGAAG
701   TGATTGATGC AGTAAACAAG GCTGGTTGGA GAATGAAAAC AACAACCGCT
751   AATGGTCAAA CAGGTCAAGC TGACAAGTTT GAAACCGTTA CATCAGGCAC
801   AAATGTAACC TTTGCTAGTG GTAAAGGTAC AACTGCGACT GTAAGTAAAG
851   ATGATCAAGG CAACATCACT GTTATGTATG ATGTAAATGT CGGCGATGCC
901   CTAAACGTCA ATCAGCTGCA AAACAGCGGT TGGAATTTGG ATTCCAAAGC
951   GGTTGCAGGT TCTTCGGGCA AAGTCATCAG CGGCAATGTT TCGCCGAGCA
1001  AGGGAAAGAT GGATGAAACC GTCAACATTA ATGCCGGCAA CAACATCGAG
1051  ATTACCCGCA ACGGTAAAAA TATCGACATC GCCACTTCGA TGACCCCGCA
1101  GTTTTCCAGC GTTTCGCTCG GCGCGGGGGC GGATGCGCCC ACTTTGAGCG
1151  TGGATGGGGA CGCATTGAAT GTCGGCAGCA AGAAGGACAA CAAACCCGTC
1201  CGCATTACCA ATGTCGCCCC GGGCGTTAAA GAGGGGGATG TTACAAACGT
1251  CGCACAACTT AAAGGCGTGG CGCAAAACTT GAACAACCGC ATCGACAATG
1301  TGGACGGCAA CGCGCGTGCG GGCATCGCCC AAGCGATTGC AACCGCAGGT
1351  CTGGTTCAGG CGTATTTGCC CGGCAAGAGT ATGATGGCGA TCGGCGGCGG
1401  CACTTATCGC GGCGAAGCCG GTTACGCCAT CGGCTACTCC AGTATTTCCG
1451  ACGGCGGAAA TTGGATTATC AAAGGCACGG CTTCCGGCAA TTCGCGCGGC
1501  CATTTCGGTG CTTCCGCATC TGTCGGTTAT CAGTGGTAA



FIG. 5B 
1     MNKIYRIIWN SALNAWVAVS ELTRNHTKRA SATVKTAVLA TLLFATVQAN
51    ATDETGLINV ETEKLSFGAN GKKVNIISDT KGLNFAKETA GTNGDTTVHL
101   NGIGSTLTDM LLNTGATTNV TNDNVTDDEK KRAASVKDVL NAGWNIKGVK
151   PGTTASDNVD FVRTYDTVEF LSADTKTTTV NVESKDNGKK TEVKIGAKTS
201   VIKEKDGKLV TGKGKGENGS STDEGEGLVT AKEVIDAVNK AGWRMKTTTA
251   NGQTGQADKF ETVTSGTKVT FASGNGTTAT VSKDDQGNIT VKYDVNVGDA
301   LNVNQLQNSG WNLDSKAVAG SSGKVISGNV SPSKGKMDET VNINAGNNIE
351   ITRNGKNIDI ATSMTPQFSS VSLGAGADAP TLSVDDEGAL NVGSKDANKP
401   VRITNVAPGV KEGDVTNVAQ LKGVAQNLNN RIDNVNGNAR AGIAQAIATA
451   GLVQAYLPGK SMMAIGGGTY LGEAGYAIGY SSISAGGNWI IKGTASGNSR
501   GHFGASASVG YQW*



FIG. 6A 



1     ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCCTGGGT
51    CGCCGTATCC GAGCTCACAC GCAACCACAC CAAACGCGCC TCCGCAACCG
101   TGAAGACCGC CGTATTGGCG ACACTGTTGT TTGCAACGGT TCAGGCGAAT
151   GCTACCGATG AAACAGGCCT GATCAATGTT GAAACTGAAA AATTATCGTT
201   TGGCGCAAAC GGCAAGAAAG TCAACATCAT AAGCGACACC AAAGGCTTGA
251   ATTTCGCGAA AGAAACGGCT GGGACGAACG GCGACACCAC GGTTCATCTG
301   AACGGTATCG GTTCGACTTT GACCGATATG CTGCTGAATA CCGGAGCGAC
351   CACAAACGTA ACCAACGACA ACGTTACCGA TGACGAGAAA AAACGTGCGG
401   CAAGCGTTAA AGACGTATTA AACGCAGGCT GGAACATTAA AGGCGTTAAA
451   CCCGGTACAA CAGCTTCCGA TAACGTTGAT TTCGTCCGCA CTTACGACAC
501   AGTCGAGTTC TTGAGCGCAG ATACGAAAAC AACGACTGTT AATGTGGAAA
551   GCAAAGACAA CGGCAAGAAA ACCGAAGTTA AAATCGGTGC GAAGACTTCT
601   GTTATTAAAG AAAAAGACGG TAAGTTGGTT ACTGGTAAAG GCAAAGGCGA
651   GAATGGTTCT TCTACAGACG AAGGCGAAGG CTTAGTGACT GCAAAAGAAG
701   TGATTGATGC AGTAAACAAG GCTGGTTGGA GAATGAAAAC AACAACCGCT
751   AATGGTCAAA CAGGTCAAGC TGACAAGTTT GAAACCGTTA CATCAGGCAC
801   AAAAGTAACC TTTGCTAGTG GTAATGGTAC AACTGCGACT GTAAGTAAAG
851   ATGATCAAGG CAACATCACT GTTAAGTATG ATGTAAATGT CGGCGATGCC
901   CTAAACGTCA ATCAGCTGCA AAACAGCGGT TGGAATTTGG ATTCCAAAGC
951   GGTTGCAGGT TCTTCGGGCA AAGTCATCAG CGGCAATGTT TCGCCGAGCA
1001  AGGGAAAGAT GGATGAAACC GTCAACATTA ATGCCGGCAA CAACATCGAG
1051  ATTACCCGCA ACGGCAAAAA TATCGACATC GCCACTTCGA TGACCCCGCA
1101  ATTTTCCAGC GTTTCGCTCG GCGCGGGGGC GGATGCGCCC ACTTTAAGCG
1151  TGGATGACGA GGGCGCGTTG AATGTCGGCA GCAAGGATGC CAACAAACCC
1201  GTCCGCATTA CCAATGTCGC CCCGGGCGTT AAAGAGGGGG ATGTTACAAA
1251  CGTCGCGCAA CTTAAAGGTG TGGCGCAAAA CTTGAACAAC CGCATCGACA
1301  ATGTGAACGG CAACGCGCGT GCGGGCATCG CCCAAGCGAT TGCAACCGCA
1351  GGTCTGGTTC AGGCGTATCT GCCCGGCAAG AGTATGATGG CGATCGGCGG
1401  CGGCACTTAT CTCGGCGAAG GCGGTTATGC CATCGGCTAC TCAAGCATTT
1451  CCGCCGGCGG AAATTGGATT ATCAAAGGCA CGGCTTCCGG CAATTCGCGC
1501  GGCCATTTCG GTGCTTCCGC ATCTGTCGGT TATCAGTGGT AA



FIG. 6B 

1 MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVKTAVLA TLLFATVQAS 
51 ANNVDFVRTY DTVEFLSADT KTTTVNVESK DNGKKTEVKI GAKTSVIKEK 
101 DGKLVTGKDK GENGSSTDEG EGLVTAKEVI DAVNKAGWRM KTTTANGQTG 
151 QADKFETVTS GTNVTFASGK GTTATVSKDD QGNITVMYDV NVGDALNVNQ 
201 LQNSGWNLDS KAVAGSSGKV ISGNVSPSKG KMDETVNINA GNNIEITRNG 
251 KNIDIATSMT PQFSSVSLGA GADAPTLSVD GDALNVGSKK DNKPVRITNV 
301 APGVKEGDVT NVAQLKGVAQ NLNNRIDNVD GNARAGIAQA IATAGLVQAY 
351 LPGKSMMAIG GGTYRGEAGY AIGYSSISDG GNWIIKGTAS GNSRGHFGAS 
401 ASVGYQW* 



FIG. 7A 



1 ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCATGGGT 
51 CGTCGTATCC GAGCTCACAC GCAACCACAC CAAACGCGCC TCCGCAACCG 
101 TGAAGACCGC CGTATTGGCG ACTCTGTTGT TTGCAACGGT TCAGGCAAGT 
151 GCTAACAACG TTGATTTCGT CCGCACTTAC GACACAGTCG AGTTCTTGAG 
201 CGCAGATACG AAAACAACGA CTGTTAATGT GGAAAGCAAA GACAACGGCA 
251 AGAAAACCGA AGTTAAAATC GGTGCGAAGA CTTCTGTTAT TAAAGAAAAA 
301 GACGGTAAGT TGGTTACTGG TAAAGACAAA GGCGAGAATG GTTCTTCTAC 
351 AGACGAAGGC GAAGGCTTAG TGACTGCAAA AGAAGTGATT GATGCAGTAA 
401 ACAAGGCTGG TTGGAGAATG AAAACAACAA CCGCTAATGG TCAAACAGGT 
451 CAAGCTGACA AGTTTGAAAC CGTTACATCA GGCACAAATG TAACCTTTGC 
501 TAGTGGTAAA GGTACAACTG CGACTGTAAG TAAAGATGAT CAAGGCAACA 
551 TCACTGTTAT GTATGATGTA AATGTCGGCG ATGCCCTAAA CGTCAATCAG 
601 CTGCAAAACA GCGGTTGGAA TTTGGATTCC AAAGCGGTTG CAGGTTCTTC 
651 GGGCAAAGTC ATCAGCGGCA ATGTTTCGCC GAGCAAGGGA AAGATGGATG 
701 AAACCGTCAA CATTAATGCC GGCAACAACA TCGAGATTAC CCGCAACGGT 
751 AAAAATATCG ACATCGCCAC TTCGATGACC CCGCAGTTTT CCAGCGTTTC 
801 GCTCGGCGCG GGGGCGGATG CGCCCACTTT GAGCGTGGAT GGGGACGCAT 
851 TGAATGTCGG CAGCAAGAAG GACAACAAAC CCGTCCGCAT TACCAATGTC 
901 GCCCCGGGCG TTAAAGAGGG GGATGTTACA AACGTCGCAC AACTTAAAGG 
951 CGTGGCGCAA AACTTGAACA ACCGCATCGA CAATGTGGAC GGCAACGCGC 
1001 GTGCGGGCAT CGCCCAAGCG ATTGCAACCG CAGGTCTGGT TCAGGCGTAT 
1051 TTGCCCGGCA AGAGTATGAT GGCGATCGGC GGCGGCACTT ATCGCGGCGA 
1101 AGCCGGTTAC GCCATCGGCT ACTCCAGTAT TTCCGACGGC GGAAATTGGA 
1151 TTATCAAAGG CACGGCTTCC GGCAATTCGC GCGGCCATTT CGGTGCTTCC 
1201 GCATCTGTCG GTTATCAGTG GTAA 



FIG. 7B 






1 MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVKTAVLA TLLFATVQAS 
51 ANRAASVKDV LNAGWNIKGV KPGTTASDNV DFVRTYDTVE FLSADTKTTT 
101 VNVESKDNGK KTEVKIGAKT SVIKEKDGKL VTGKDKGENG SSTDEGEGLV 
151 TAKEVIDAVN KAGWRMKTTT ANGQTGQADK FETVTSGTNV TFASGKGTTA 
201 TVSKDDQGNI TVMYDVNVGD ALNVNQLQNS GWNLDSKAVA GSSGKVISGN 
251 VSPSKGKMDE TVNINAGNNI EITRNGKNID IATSMTPQFS SVSLGAGADA 
301 PTLSVDGDAL NVGSKKDNKP VRITNVAPGV KEGDVTNVAQ LKGVAQNLNN 
351 RIDNVDGNAR AGIAQAIATA GLVQAYLPGK SMMAIGGGTY RGEAGYAIGY 
401 SSISDGGNWI IKGTASGNSR GHFGASASVG YQW* 



FIG. 8A 



1 ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCATGGGT 
51 CGTCGTATCC GAGCTCACAC GCAACCACAC CAAACGCGCC TCCGCAACCG 
101 TGAAGACCGC CGTATTGGCG ACTCTGTTGT TTGCAACGGT TCAGGCAAGT 
151 GCTAACCGTG CGGCAAGCGT TAAAGACGTA TTAAACGCTG GCTGGAACAT 
201 TAAAGGCGTT AAACCCGGTA CAACAGCTTC CGATAACGTT GATTTCGTCC 
251 GCACTTACGA CACAGTCGAG TTCTTGAGCG CAGATACGAA AACAACGACT 
301 GTTAATGTGG AAAGCAAAGA CAACGGCAAG AAAACCGAAG TTAAAATCGG 
351 TGCGAAGACT TCTGTTATTA AAGAAAAAGA CGGTAAGTTG GTTACTGGTA 
401 AAGACAAAGG CGAGAATGGT TCTTCTACAG ACGAAGGCGA AGGCTTAGTG 
451 ACTGCAAAAG AAGTGATTGA TGCAGTAAAC AAGGCTGGTT GGAGAATGAA 
501 AACAACAACC GCTAATGGTC AAACAGGTCA AGCTGACAAG TTTGAAACCG 
551 TTACATCAGG CACAAATGTA ACCTTTGCTA GTGGTAAAGG TACAACTGCG 
601 ACTGTAAGTA AAGATGATCA AGGCAACATC ACTGTTATGT ATGATGTAAA 
651 TGTCGGCGAT GCCCTAAACG TCAATCAGCT GCAAAACAGC GGTTGGAATT 
701 TGGATTCCAA AGCGGTTGCA GGTTCTTCGG GCAAAGTCAT CAGCGGCAAT 
751 GTTTCGCCGA GCAAGGGAAA GATGGATGAA ACCGTCAACA TTAATGCCGG 
801 CAACAACATC GAGATTACCC GCAACGGTAA AAATATCGAC ATCGCCACTT 
851 CGATGACCCC GCAGTTTTCC AGCGTTTCGC TCGGCGCGGG GGCGGATGCG 
901 CCCACTTTGA GCGTGGATGG GGACGCATTG AATGTCGGCA GCAAGAAGGA 
951 CAACAAACCC GTCCGCATTA CCAATGTCGC CCCGGGCGTT AAAGAGGGGG 
1001 ATGTTACAAA CGTCGCACAA CTTAAAGGCG TGGCGCAAAA CTTGAACAAC 
1051 CGCATCGACA ATGTGGACGG CAACGCGCGT GCGGGCATCG CCCAAGCGAT 
1101 TGCAACCGCA GGTCTGGTTC AGGCGTATTT GCCCGGCAAG AGTATGATGG 
1151 CGATCGGCGG CGGCACTTAT CGCGGCGAAG CCGGTTACGC CATCGGCTAC 
1201 TCCAGTATTT CCGACGGCGG AAATTGGATT ATCAAAGGCA CGGCTTCCGG 
1251 CAATTCGCGC GGCCATTTCG GTGCTTCCGC ATCTGTCGGT TATCAGTGGT 
1301 AA 

1 MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVKTAVLA TLLFATVQAS 
51 ANTLKAGDNL KIKQFTYSLK KDLTDLTSVG TEKLSFSANG NKVNITSDTK 
101 GLNFAKETAG TNGDTTVHLN GIGSTLTDRA ASVKDVLNAG WNIKGVKNVD 
151 FVRTYDTVEF LSADTKTTTV NVESKDNGKK TEVKIGAKTS VIKEKDGKLV 
201 TGKDKGENGS STDEGEGLVT AKEVIDAVNK AGWRMKTTTA NGQTGQADKF 
251 ETVTSGTNVT FASGKGTTAT VSKDDQGNIT VMYDVNVGDA LNVNQLQNSG 
301 WNLDSKAVAG SSGKVISGNV SPSKGKMDET VNINAGNNIE ITRNGKNIDI 
351 ATSMTPQFSS VSLGAGADAP TLSVDGDALN VGSKKDNKPV RITNVAPGVK 
401 EGDVTNVAQL KGVAQNLNNR IDNVDGNARA GIAQAIATAG LVQAYLPGKS 
451 MMAIGGGTYR GEAGYAIGYS SISDGGNWII KGTASGNSRG HFGASASVGY 
501 QW* 



FIG. 9A 



1 ATGAACAAAA TATACCGCAT CATTTGGAAT AGTGCCCTCA ATGCATGGGT 
51 CGTCGTATCC GAGCTCACAC GCAACCACAC CAAACGCGCC TCCGCAACCG 
101 TGAAGACCGC CGTATTGGCG ACTCTGTTGT TTGCAACGGT TCAGGCAAGT 
151 GCTAACACCC TCAAAGCCGG CGACAACCTG AAAATCAAAC AATTCACCTA 
201 CTCGCTGAAA AAAGACCTCA CAGATCTGAC CAGTGTTGGA ACTGAAAAAT 
251 TATCGTTTAG CGCAAACGGC AATAAAGTCA ACATCACAAG CGACACCAAA 
301 GGCTTGAATT TTGCGAAAGA AACGGCTGGG ACGAACGGCG ACACCACGGT 
351 TCATCTGAAC GGTATTGGTT CGACTTTGAC CGATCGTGCG GCAAGCGTTA 
401 AAGACGTATT AAACGCTGGC TGGAACATTA AAGGCGTTAA AAACGTTGAT 
451 TTCGTCCGCA CTTACGACAC AGTCGAGTTC TTGAGCGCAG ATACGAAAAC 
501 AACGACTGTT AATGTGGAAA GCAAAGACAA CGGCAAGAAA ACCGAAGTTA 
551 AAATCGGTGC GAAGACTTCT GTTATTAAAG AAAAAGACGG TAAGTTGGTT 
601 ACTGGTAAAG ACAAAGGCGA GAATGGTTCT TCTACAGACG AAGGCGAAGG 
651 CTTAGTGACT GCAAAAGAAG TGATTGATGC AGTAAACAAG GCTGGTTGGA 
701 GAATGAAAAC AACAACCGCT AATGGTCAAA CAGGTCAAGC TGACAAGTTT 
751 GAAACCGTTA CATCAGGCAC AAATGTAACC TTTGCTAGTG GTAAAGGTAC 
801 AACTGCGACT GTAAGTAAAG ATGATCAAGG CAACATCACT GTTATGTATG 
851 ATGTAAATGT CGGCGATGCC CTAAACGTCA ATCAGCTGCA AAACAGCGGT 
901 TGGAATTTGG ATTCCAAAGC GGTTGCAGGT TCTTCGGGCA AAGTCATCAG 
951 CGGCAATGTT TCGCCGAGCA AGGGAAAGAT GGATGAAACC GTCAACATTA 
1001 ATGCCGGCAA CAACATCGAG ATTACCCGCA ACGGTAAAAA TATCGACATC 
1051 GCCACTTCGA TGACCCCGCA GTTTTCCAGC GTTTCGCTCG GCGCGGGGGC 
1101 GGATGCGCCC ACTTTGAGCG TGGATGGGGA CGCATTGAAT GTCGGCAGCA 
1151 AGAAGGACAA CAAACCCGTC CGCATTACCA ATGTCGCCCC GGGCGTTAAA 
1201 GAGGGGGATG TTACAAACGT CGCACAACTT AAAGGCGTGG CGCAAAACTT 
1251 GAACAACCGC ATCGACAATG TGGACGGCAA CGCGCGTGCG GGCATCGCCC 
1301 AAGCGATTGC AACCGCAGGT CTGGTTCAGG CGTATTTGCC CGGCAAGAGT 
1351 ATGATGGCGA TCGGCGGCGG CACTTATCGC GGCGAAGCCG GTTACGCCAT 
1401 CGGCTACTCC AGTATTTCCG ACGGCGGAAA TTGGATTATC AAAGGCACGG 
1451 CTTCCGGCAA TTCGCGCGGC CATTTCGGTG CTTCCGCATC TGTCGGTTAT 
1501 CAGTGGTAA 

1 50 
H41 MNKIYRIIWN SALNAWVAVS ELTRNHTKRA SATVKTAVLA TLLFATVQAN 
PMC21 MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVKTAVLA TLLFATVQAS 
H41Studel MNKIYRIIWN SALNAWVAVS ELTRNHTKRA SATVKTAVLA TLLFATVQAN 
PMC21Bgldel MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVKTAVLA TLLFATVQAS 
PMC21C1C5 MNKIYRIIWN SALNAWVVVS ELTRNHTKRA SATVKTAVLA TLLFATVQAS 
C1 



51 100 
H41 ATDED..EEE ELESVQRS.V VGSIQASMEG SVELET...I SLSMTNDSKE 
PMC21 ANNEEQEEYL YLHPVQRTVA VLIVNSDKEG AGEKEKVEEN SDWAVYFNEK 
H41Studel ATDE 
PMC21Bgldel ANNE 
PMC21C1C5 AN 
V1 

101 150 
H41 FVDPYIWTL KAGDNLKIKQ N.TNENTNAS SFTYSLKKDL TGLINVETEK 
PMC21 GVLTAREITL KAGDNLKIKQ NGTN FTYSLKKDL TDLTSVGTEK 
H41Studel TGLINVETEK 
PMC21Bgldel TDLTSVGTEK 
PMC21C1C5 
V1 C2 V2 C3 

151 200 
H41 LSFGANGKKV NIISDTKGLN FAKETAGTNG DTTVHLNGIG STLTDMLLNT 
PMC21 LSFSAHGNKV NITSDTKGLN FAKETAGTNG DTTVHLNGIG STLTDTLLNT 
H41Studel LSFGANGKKV NIISDTKGLN FAKETAGTNG DTTVHLNGIG STLTDMLLNT 
PMC21Bgldel LSFSANGNKV NITSDTKGLN FAKETAGTNG DTTVHLNGIG STLTDTLLNT 
PMC21C1C5 
C3 V3 

201 250 
H41 GATTNVTNDN VTDDEKKRAA SVKDVLNAGW NIKGVKPGTT ASDNVDFVRT 
PMC21 GATTNVTNDN VTDDEKKRAA SVKDVLNAGW NIKGVKPGTT ASDNVDFVRT 
H41Studel GATTNVTNDN VTDDEKKRAA SVKDVLNAGW NIKGVKPGTT ASDNVDFVRT 
PMC21Bgldel GATTNVTNDN VTDDEKKRAA SVKDVLNAGW NIKGVKPGTT ASDNVDFVRT 
PMC21C1C5 NVDFVRT 
V3 C4 V4 C5 

251 300 
H41 YDTVEFLSAD TKTTTVNVES KDNGKKTEVK IGAKTSVIKE KDGKLVTGKG 
PMC21 YDTVEFLSAD TKTTTVNVES KDNGKKTEVK IGAKTSVIKE KDGKLVTGKD 
H41Studel YDTVEFLSAD TKTTTVNVES KDNGKKTEVK IGAKTSVIKE KDGKLVTGKG 
PMC21Bgldel YDTVEFLSAD TKTTTVNVES KDNGKKTEVK IGAKTSVIKE KDGKLVTGKD 
PMC21C1C5 YDTVEFLSAD TKTTTVNVES KDNGKKTEVK IGAKTSVIKE KDGKLVTGKD 
C5 



301 350 
H41 KGENGSSTDE GEGLVTAKEV IDAVNKAGWR MKTTTANGQT GQADKFETVT 
PMC21 KGENGSSTDE GEGLVTAKEV IDAVNKAGWR MKTTTANGQT GQADKFETVT 
H41Studel KGENGSSTDE GEGLVTAKEV IDAVNKAGWR MKTTTANGQT GQADKFETVT 
PMC21Bgldel KGENGSSTDE GEGLVTAKEV IDAVNKAGWR MKTTTANGQT GQADKFETVT 
PMC21C1C5 KGENGSSTDE GEGLVTAKEV IDAVNKAGWR MKTTTANGQT GQADKFETVT 
C5 



351 400 
H41 SGTKVTFASG NGTTATVSKD DQGNITVKYD VNVGDALNVN QLQNSGWNLD 
PMC21 SGTKVTFASG KGTTATVSKD DQGNITVMYD VNVGDALNVN QLQNSGWNLD 
H41Studel SGTKVTFASG NGTTATVSKD DQGNITVKYD VNVGDALNVN QLQNSGWNLD 
PMC21Bgldel SGTNVTFASG KGTTATVSKD DQGNITVMYD VNVGDALNVN QLQNSGWNLD 
PMC21C1C5 SGTNVTFASG KGTTATVSKD DQGNITVMYD VNVGDALNVN QLQNSGWNLD 
C5 

401 450 
H41 SKAVAGSSGK VISGNVSPSK GKMDETVNIN AGNNIEITRN GKNIDIATSM 
PMC21 SKAVAGSSGK VISGNVSPSK GKMDETVNIN AGNNIEITRN GKNIDIATSM 
H41Studel SKAVAGSSGK VISGNVSPSK GKMDETVNIN AGNNIEITRN GKNIDIATSM 
PMC21Bgldel SKAVAGSSGK VISGNVSPSK GKMDETVNIN AGNNIEITRN GKNIDIATSM 
PMC21C1C5 SKAVAGSSGK VISGNVSPSK GKMDETVNIN AGNNIEITRN GKNIDIATSM 
C5 



451 500 
H41 TPQFSSVSLG AGADAPTLSV DDEGALNVGS KDANKPVRIT NVAPGVKEGD 
PMC21 TPQFSSVSLG AGADAPTLSV DG.DALNVGS KKDNKPVRIT NVAPGVKEGD 
H41Studel TPQFSSVSLG AGADAPTLSV DDEGALNVGS KDANKPVRIT NVAPGVKEGD 
PMC21Bgldel TPQFSSVSLG AGADAPTLSV DG.DALNVGS KKDNKPVRIT NVAPGVKEGD 
PMC21C1C5 TPQFSSVSLG AGADAPTLSV DG.DALNVGS KKDNKPVRIT NVAPGVKEGD 
C5 



501 550 
H41 VTNVAQLKGV AQNLNNRIDN VNGNARAGIA QAIATAGLVQ AYLPGKSMMA 
PMC21 VTNVAQLKGV AQNLNNRIDN VDGNARAGIA QAIATAGLVQ AYLPGKSMMA 
H41Studel VTNVAQLKGV AQNLNNRIDN VNGNARAGIA QAIATAGLVQ AYLPGKSMMA 
PMC21Bgldel VTNVAQLKGV AQNLNNRIDN VDGNARAGIA QAIATAGLVQ AYLPGKSMMA 
PMC21C1C5 VTNVAQLKGV AQNLNNRIDN VDGNARAGIA QAIATAGLVQ AYLPGKSMMA 
C5 



551 600 
H41 IGGGTYLGEA GYAIGYSSIS AGGNWIIKGT ASGNSRGHFG ASASVGYQW. 
PMC21 IGGGTYRGEA GYAIGYSSIS DGGNWIIKGT ASGNSRGHFG ASASVGYQW. 
H41Studel IGGGTYLGEA GYAIGYSSIS AGGNWIIKGT ASGNSRGHFG ASASVGYQW. 
PMC21Bgldel IGGGTYRGEA GYAIGYSSIS DGGNWIIKGT ASGNSRGHFG ASASVGYQW. 
PMC21C1C5 IGGGTYRGEA GYAIGYSSIS DGGNWIIKGT ASGNSRGHFG ASASVGYQW. 
C5 

51 ANNEEQEEYL YLHPVQRTVA VLIVNSDKEG AGEKEKVEEN SDWAVYFNEK 
101 GVLTAREITL KAGDNLKIKQ NGTNFTYSLK KDLTDLTSVG TEKLSFSAHG 
151 NKVNITSDTK GLNFAKETAG TNGDTTVHLN GIGSTLTDTL LNTGATTNVT 
201 NDNVTDDEKK RAASVKDVLN AGWNIKGVKP GTTASDNVDF VRTYDTVEFL 
251 SADTKTTTVN VESKDNGKKT EVKIGAKTSV IKEKDGKLVT GKDKGENGSS 
301 TDEGEGLVTA KEVIDAVNKA GWRMKTTTAN GQTGQADKFE TVTSGTNVTF 
351 ASGKGTTATV SKDDQGNITV MYDVNVGDAL NVNQLQNSGW NLDSKAVAGS 
401 SGKVISGNVS PSKGKMDETV NINAGNNIEI TRNGKNIDIA TSMTPQFSSV 
451 SLGAGADAPT LSVDGDALNV GSKKDNKPVR ITNVAPGVKE GDVTNVAQLK 
501 GVAQNLNNRI DNVDGNARAG IAQAIATAGL VQAYLPGKSM MAIGGGTYRG 
551 EAGYAIGYSS ISDGGNWIIK GTASGNSRGH FGASASVGYQ W* 

51 ATDEDEEEEL ESVQRSVVGS IQASMEGSVE LETISLSMTN DSKEFVDPYI 
101 WTLKAGDNL KIKQNTNENT NASSFTYSLK KDLTGLINVE TEKLSFGANG 
151 KKVNIISDTK GLNFAKETAG TNGDTTVHLN GIGSTLTDML LNTGATTNVT 
201 NDNVTDDEKK RAASVKDVLN AGWNIKGVKP GTTASDNVDF VRTYDTVEFL 
251 SADTKTTTVN VESKDNGKKT EVKIGAKTSV IKEKDGKLVT GKGKGENGSS 
301 TDEGEGLVTA KEVIDAVNKA GWRMKTTTAN GQTGQADKFE TVTSGTKVTF 
351 ASGNGTTATV SKDDQGNITV KYDVNVGDAL NVNQLQNSGW NLDSKAVAGS 
401 SGKVISGNVS PSKGKMDETV NINAGNNIEI TRNGKNIDIA TSMTPQFSSV 
451 SLGAGADAPT LSVDDEGALN VGSKDANKPV RITNVAPGVK EGDVTNVAQL 
501 KGVAQNLNNR IDNVNGNARA GIAQAIATAG LVQAYLPGKS MMAIGGGTYL 
551 GEAGYAIGYS SISAGGNWII KGTASGNSRG HFGASASVGY QW* 

51 ANNETDLTSV GTEKLSFSAN GNKVNITSDT KGLNFAKETA GTNGDTTVHL 
101 NGIGSTLTDT LLNTGATTNV TNDNVTDDEK KRAASVKDVL NAGWNIKGVK 
151 PGTTASDNVD FVRTYDTVEF LSADTKTTTV NVESKDNGKK TEVKIGAKTS 
201 VIKEKDGKLV TGKDKGENGS STDEGEGLVT AKEVIDAVNK AGWRMKTTTA 
251 NGQTGQADKF ETVTSGTNVT FASGKGTTAT VSKDDQGNIT VMYDVNVGDA 
301 LNVNQLQNSG WNLDSKAVAG SSGKVISGNV SPSKGKMDET VNINAGNNIE 
351 ITRNGKNIDI ATSMTPQFSS VSLGAGADAP TLSVDGDALN VGSKKDNKPV 
401 RITNVAPGVK EGDVTNVAQL KGVAQNLNNR IDNVDGNARA GIAQAIATAG 
451 LVQAYLPGKS MMAIGGGTYR GEAGYAIGYS SISDGGNWII KGTASGNSRG 
501 HFGASASVGY QW* 

51 ATDETGLINV ETEKLSFGAN GKKVNIISDT KGLNFAKETA GTNGDTTVHL 
101 NGIGSTLTDM LLNTGATTNV TNDNVTDDEK KRAASVKDVL NAGWNIKGVK 
151 PGTTASDNVD FVRTYDTVEF LSADTKTTTV NVESKDNGKK TEVKIGAKTS 
201 VIKEKDGKLV TGKGKGENGS STDEGEGLVT AKEVIDAVNK AGWRMKTTTA 
251 NGQTGQADKF ETVTSGTKVT FASGNGTTAT VSKDDQGNIT VKYDVNVGDA 
301 LNVNQLQNSG WNLDSKAVAG SSGKVISGNV SPSKGKMDET VNINAGNNIE 
351 ITRNGKNIDI ATSMTPQFSS VSLGAGADAP TLSVDDEGAL NVGSKDANKP 
401 VRITNVAPGV KEGDVTNVAQ LKGVAQNLNN RIDNVNGNAR AGIAQAIATA 
451 GLVQAYLPGK SMMAIGGGTY LGEAGYAIGY SSISAGGNWI IKGTASGNSR 
501 GHFGASASVG YQW* 

51 ANNVDFVRTY DTVEFLSADT KTTTVNVESK DNGKKTEVKI GAKTSVIKEK 
101 DGKLVTGKDK GENGSSTDEG EGLVTAKEVI DAVNKAGWRM KTTTANGQTG 
151 QADKFETVTS GTNVTFASGK GTTATVSKDD QGNITVMYDV NVGDALNVNQ 
201 LQNSGWNLDS KAVAGSSGKV ISGNVSPSKG KMDETVNINA GNNIEITRNG 
251 KNIDIATSMT PQFSSVSLGA GADAPTLSVD GDALNVGSKK DNKPVRITNV 
301 APGVKEGDVT NVAQLKGVAQ NLNNRIDNVD GNARAGIAQA IATAGLVQAY 
351 LPGKSMMAIG GGTYRGEAGY AIGYSSISDG GNWIIKGTAS GNSRGHFGAS 
401 ASVGYQW* 

FIG. 14E 





51 ANRAASVKDV LNAGWNIKGV KPGTTASDNV DFVRTYDTVE FLSADTKTTT 
101 VNVESKDNGK KTEVKIGAKT SVIKEKDGKL VTGKDKGENG SSTDEGEGLV 
151 TAKEVIDAVN KAGWRMKTTT ANGQTGQADK FETVTSGTNV TFASGKGTTA 
201 TVSKDDQGNI TVMYDVNVGD ALNVNQLQNS GWNLDSKAVA GSSGKVISGN 
251 VSPSKGKMDE TVNINAGNNI EITRNGKNID IATSMTPQFS SVSLGAGADA 
301 PTLSVDGDAL NVGSKKDNKP VRITNVAPGV KEGDVTNVAQ LKGVAQNLNN 
351 RIDNVDGNAR AGIAQAIATA GLVQAYLPGK SMMAIGGGTY RGEAGYAIGY 
401 SSISDGGNWI IKGTASGNSR GHFGASASVG YQW* 

51 ANTLKAGDNL KIKQFTYSLK KDLTDLTSVG TEKLSFSANG NKVNITSDTK 
101 GLNFAKETAG TNGDTTVHLN GIGSTLTDRA ASVKDVLNAG WNIKGVKNVD 
151 FVRTYDTVEF LSADTKTTTV NVESKDNGKK TEVKIGAKTS VIKEKDGKLV 
201 TGKDKGENGS STDEGEGLVT AKEVIDAVNK AGWRMKTTTA NGQTGQADKF 
251 ETVTSGTNVT FASGKGTTAT VSKDDQGNIT VMYDVNVGDA LNVNQLQNSG 
301 WNLDSKAVAG SSGKVISGNV SPSKGKMDET VNINAGNNIE ITRNGKNIDI 
351 ATSMTPQFSS VSLGAGADAP TLSVDGDALN VGSKKDNKPV RITNVAPGVK 
401 EGDVTNVAQL KGVAQNLNNR IDNVDGNARA GIAQAIATAG LVQAYLPGKS 
451 MMAIGGGTYR GEAGYAIGYS SISDGGNWII KGTASGNSRG HFGASASVGY 
501 QW* 



FIG. 14G 



